Lexical Semantics Domain Model for Information Extraction

نویسنده

  • Patricia Lutsky
چکیده

The domain of operating system reference manuals uses linguistic constructs that are difficult to process since many terms are similar and most concepts are abstract software engineering constructs. The SIFT system for automatic test generation from these documents uses a natural-languagebased formalism for software domain models. The formalism is based on the generative lexicon framework (Pustejovsky 1995). Examples show how this model is used for information extraction from texts in the software

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A protocol for constructing a domain-specific ontology for use in biomedical information extraction using lexical-chaining analysis

In order to do more semantics-based information extraction, we require specialized domain models. We develop a hybrid approach for constructing such a domain-specific ontology, which integrates key concepts from the protein-protein– interaction domain with the Gene Ontology. In addition, we present a method for using the domain-specific ontology in a discourse-based analysis module for analyzin...

متن کامل

Semantic Based Information Extraction from Web

Extraction of information from web is a challenging task. The information stored in a web may be structured or unstructured information. The structured information provides enhanced knowledge which helps to retrieve relevant documents. It helps the user to understand particular domain. This paper explores the importance of information extraction using semantics. It enables the users to discover...

متن کامل

Discourse Complements Lexical Semantics for Non-factoid Answer Reranking

We propose a robust answer reranking model for non-factoid questions that integrates lexical semantics with discourse information, driven by two representations of discourse: a shallow representation centered around discourse markers, and a deep one based on Rhetorical Structure Theory. We evaluate the proposed model on two corpora from different genres and domains: one from Yahoo! Answers and ...

متن کامل

A lexicon for biology and bioinformatics: the BOOTStrep experience

This paper describes the design, implementation and population of a lexical resource for biology and bioinformatics (the BioLexicon) developed within an ongoing European project. The aim of this project is text-based knowledge harvesting for support to information extraction and text mining in the biomedical domain. The BioLexicon is a large-scale lexical-terminological resource encoding differ...

متن کامل

Lexical Semantics and Selection of TAM in Bantu Languages: A Case of Semantic Classification of Kiswahili Verbs

The existing literature on Bantu verbal semantics demonstrated that inherent semantic content of verbs pairs directly with the selection of tense, aspect and modality formatives in Bantu languages like Chasu, Lucazi, Lusamia, and Shiyeyi. Thus, the gist of this paper is the articulation of semantic classification of verbs in Kiswahili based on the selection of TAM types. This is because the sem...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004